December 22, 2020

Outline

  • Neovim-based IDE for R
  • Parallel R with batchtools

R on Cluster of HPCC

  • Knowledge of the command-line interface is essential for working efficiently on a computer cluster
  • Advantage: a language-agnostic approach that works with most programming languages, whereas GUI applications like RStudio are mostly limited to R.
  • This tutorial introduces the Nvim-R-Tmux environment
  • Users of Emacs may want to consider using ESS instead.
  • Alternative for beginners: RStudio Server instance on HPCC/biocluster

Nvim-R-Tmux: Terminal-based R Environment

Animated Screenshot of Nvim-R (from here):

Introduction to Nvim-R-Tmux

  • Continue on Nvim-R-Tmux tutorial here
  • After connecting to the cluster head node via ssh, log in to one of the compute nodes:
srun --x11 --partition=short --mem=2gb --cpus-per-task 4 --ntasks 1 --time 1:00:00 --pty bash -l

Download the demo file to your HPCC/biocluster account:

wget https://raw.githubusercontent.com/ucr-hpcc/ucr-hpcc.github.io/master/_support_docs/tutorials/nvim_demo.R

Outline

  • Basics
  • Scalable Complexity via Scrolling
  • Images and Graphics
  • References

Scrolling within Code Blocks, Tables and Beyond Slide Boundaries

  • Scrolling of code chunks is supported by the CSS code placed after the preamble.
z <- "dajfdfkfffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff"
z
z
z
z
z
z
z
z
z
  • Note: data.frames are automatically printed in paged form when df_print: paged is included in the preamble. In addition, the number of rows shown on each page can be set by assigning the desired number to the rows.print argument in the header of the corresponding code chunk (e.g. below it is set to 75 rows).
x <- cbind(iris, iris[,5:1])
x

Job Submission with sbatch

Print information about queues/partitions available on a cluster.

sinfo 

Compute jobs are submitted with sbatch via a submission script (here script_name.sh).

sbatch script_name.sh
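
When submitting from a script, it can help to capture the job ID that sbatch reports. The following is a minimal sketch, not part of the original tutorial: submit_job is a hypothetical helper name, and it relies on the --parsable option of sbatch, which prints the job ID (optionally followed by ";" and the cluster name) instead of the usual message.

```shell
# submit_job: hypothetical helper that submits a script and prints only the job ID.
submit_job() {
    local script="$1"
    local job_id
    # --parsable makes sbatch print "jobid" or "jobid;cluster"
    job_id=$(sbatch --parsable "$script") || return 1
    # Drop the optional ";cluster" suffix
    echo "${job_id%%;*}"
}
```

For example, job_id=$(submit_job script_name.sh) followed by squeue -j "$job_id" would show the current state of that job.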

Sample submission script

#!/bin/bash -l

#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1G
#SBATCH --time=1-00:15:00 # 1 day and 15 minutes
#SBATCH --mail-user=useremail@address.com
#SBATCH --mail-type=ALL
#SBATCH --job-name="some_test"
#SBATCH -p batch # Choose queue/partition from: intel, batch, highmem, gpu, short

myscript.sh
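
For R work, the last line of a submission script will often call Rscript rather than a shell script. The following is a minimal sketch under stated assumptions: analysis.R is a placeholder R script, write_r_job is a hypothetical helper that writes such a submission script, the resource values are examples to adapt, and any cluster-specific setup (such as loading an R module) is omitted.

```shell
# write_r_job: hypothetical helper that writes a Slurm submission script
# running the given R script with Rscript.
write_r_job() {
    local r_script="$1"   # R script to run (assumed to exist at submit time)
    local out_file="$2"   # submission script to create
    cat > "$out_file" <<EOF
#!/bin/bash -l

#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1G
#SBATCH --time=0-00:15:00 # 15 minutes
#SBATCH --job-name="r_test"
#SBATCH -p short

Rscript "$r_script"
EOF
}

# Write r_job.sh for the placeholder script analysis.R
write_r_job analysis.R r_job.sh
echo "Submit with: sbatch r_job.sh"
```

The generated r_job.sh can be inspected before it is submitted with sbatch.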

Interactive session with specific resource requests

srun --x11 --partition=short --mem=2gb --cpus-per-task 4 --ntasks 1 --time 1:00:00 --pty bash -l

DataTables support

library(DT)
datatable(iris)

Font Size

  • Note: ioslides uses larger font sizes by default ('smaller: false'). In practice, it is usually set to 'smaller: true'.
  • With the default setting, one can also set smaller font sizes on a per-slide basis by specifying '{.smaller}' at the end of a slide title
  • For finer control over font size, use embedded HTML code. Here are some examples:
    • HTML font size 28px
    • HTML font size 18px
    • HTML font size 14px
    • HTML font size 12px

Center Text

  • To vertically center content, use the {.flexbox .vcenter} option after the title of a slide
  • HTML tags can also be used.

Two Column Layout

This is useful for placing a figure on the right and bullets describing it on the left.

  • Bullet 1
  • Bullet 2
  • Bullet 3

Images

  • Using HTML code to insert images is the most flexible approach

Background Images

  • Bullet 1
  • Bullet 2
  • Bullet 3

Real-time Graphics Code Evaluation

# Mean of each measurement per species, reshaped to long format and plotted as grouped bars
library(dplyr)
library(ggplot2)
library(reshape2)
iris %>%
    group_by(Species) %>%
    summarize_all(mean) %>%
    reshape2::melt(id.vars = c("Species"), variable.name = "Samples", value.name = "Values") %>%
    ggplot(aes(Samples, Values, fill = Species)) +
    geom_bar(position = "dodge", stat = "identity")

References